Attacking LLMs
Learn to identify and exploit LLM vulnerabilities, covering prompt injection, insecure output handling, and model poisoning.
In this module, we cover practical attacks against systems that use large language models, including prompt injection, unsafe output handling, and model poisoning . You will learn how crafted inputs and careless handling of model output can expose secrets or trigger unauthorised actions, and how poisoned training data can cause persistent failures. Each topic includes hands-on exercises and realistic scenarios that show how small issues can be linked into larger attack paths. By the end, participants can build concise proof of concept attacks and suggest clear, practical mitigations.
Input Manipulation & Prompt Injection
Understand the basics of LLM Prompt Injection attacks.
LLM Output Handling and Privacy Risks
Learn how LLMs handle their output and the privacy risks behind it.
Data Integrity & Model Poisoning
Understand how supply chain and model poisoning attacks can corrupt the underlying LLM.
Juicy
A friendly golden retriever who answers your questions.
BankGPT
A customer service assistant used by a banking system.
HealthGPT
A safety-compliant AI assistant that has strict rules against revealing sensitive internal data.